# End-to-end ASR

Parakeet Tdt 0.6b V2 Onnx

NVIDIA Parakeet TDT 0.6B V2 is an automatic speech recognition (ASR) model for English speech-to-text.

Speech Recognition English

Asr Wav2vec2 Commonvoice 14 Es

This is an end-to-end automatic speech recognition system trained on the CommonVoice Spanish dataset, using the wav2vec 2.0 pre-trained model combined with a CTC decoder.

Speech Recognition Spanish
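
Several entries in this list pair a wav2vec 2.0 encoder with a CTC decoder. As a rough illustration of what the CTC decoding step does, the sketch below applies greedy decoding to made-up per-frame scores: take the best symbol per frame, merge consecutive repeats, then drop blanks. The vocabulary, the blank symbol `_`, and the scores are assumptions chosen for the example; they are not taken from any of the listed models, which may use beam search or other decoders.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen for this example only


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring symbol for every frame.
    best = [vocab[max(range(len(row)), key=row.__getitem__)] for row in frame_scores]
    # Merge runs of the same symbol into one occurrence.
    collapsed = [symbol for symbol, _ in groupby(best)]
    # Remove blanks, which only serve to separate repeated symbols.
    return "".join(symbol for symbol in collapsed if symbol != blank)


# Toy vocabulary and scores (illustrative values, not real model output).
vocab = ["_", "h", "o", "l", "a"]
frames = [
    [0.10, 0.80, 0.00, 0.05, 0.05],  # h
    [0.10, 0.80, 0.00, 0.05, 0.05],  # h (repeat, merged)
    [0.70, 0.10, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.00, 0.80, 0.05, 0.05],  # o
    [0.10, 0.00, 0.10, 0.75, 0.05],  # l
    [0.10, 0.00, 0.00, 0.10, 0.80],  # a
]

print(ctc_greedy_decode(frames, vocab))  # prints "hola"
```

The blank symbol is what lets CTC emit a doubled letter: two `l` frames separated by a blank frame decode to `ll`, while two adjacent `l` frames decode to a single `l`.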

Asr Whisper Medium Commonvoice Ar

A Whisper medium speech recognition model fine-tuned on the CommonVoice Arabic dataset, developed by the SpeechBrain team.

Speech Recognition Arabic

Asr Whisper Medium Commonvoice Fa

A Whisper medium model fine-tuned on the CommonVoice 14.0 Persian dataset for Persian automatic speech recognition.

Speech Recognition Other

Faster Whisper Large V2 Japanese 5k Steps

A Japanese automatic speech recognition (ASR) model based on Whisper Large V2, optimized with CTranslate2 for efficient inference.

Speech Recognition

Transformers Japanese

Wav2vec2 Large Xlsr 53 Spanish Ep5 944h

An acoustic model for Spanish automatic speech recognition, fine-tuned from facebook/wav2vec2-large-xlsr-53 for 5 epochs on approximately 944 hours of Spanish data.

Speech Recognition

Transformers Spanish

carlosdanielhernandezmena

Icefall Asr Gigaspeech Conformer Ctc

A Conformer CTC model for GigaSpeech built with icefall, an automatic speech recognition (ASR) toolkit based on the k2 framework that focuses on efficient and flexible model training and inference.

Speech Recognition English

Asr Wav2vec2 Dvoice Wolof

This is an automatic speech recognition model for Wolof, based on the wav2vec 2.0 architecture and trained on the DVoice dataset.

Speech Recognition Other

Asr Wav2vec2 Dvoice Amharic

This is an automatic speech recognition model for Amharic, built on the wav2vec 2.0 architecture and trained with a CTC/attention mechanism.

Speech Recognition Other

Wav2vec2 Large Xlsr Turkish Demo Colab

A speech recognition model based on facebook/wav2vec2-large-xlsr-53, fine-tuned on the Common Voice Turkish dataset.

Speech Recognition

Wav2vec2 Large Xls R 300m Turkish Colab

A speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the CommonVoice Turkish dataset.

Speech Recognition

Asr Transformer Aishell

A pre-trained end-to-end automatic speech recognition system for Mandarin, built with the SpeechBrain framework and featuring a Transformer encoder with a joint decoder.

Speech Recognition Chinese

Wav2vec2 Large Xlsr Kyrgyz

This is an automatic speech recognition model fine-tuned on the Kyrgyz Common Voice dataset, based on the facebook/wav2vec2-large-xlsr-53 model.

Speech Recognition Other

Wav2vec2 Large Xlsr 53 Turkish

This is an automatic speech recognition (ASR) model based on Facebook's wav2vec2-large-xlsr-53, fine-tuned on the Turkish Common Voice dataset.

Speech Recognition Other

Wav2vec2 Large Xlsr Estonian

This is an Estonian automatic speech recognition (ASR) model fine-tuned from the facebook/wav2vec2-large-xlsr-53 model, trained using the Common Voice dataset.

Speech Recognition Other

Asr Crdnn Commonvoice Fr

This is an end-to-end automatic speech recognition system trained on the CommonVoice French dataset, utilizing a CRDNN architecture combined with CTC and attention mechanisms.

Speech Recognition French
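
Entries that combine CTC with an attention decoder typically score each candidate transcript with both components. A common formulation is a weighted sum of the two log-probabilities. The sketch below shows that idea on invented numbers; the function names, the weight of 0.3, and the candidate scores are all assumptions for illustration and do not describe how the listed model is configured.

```python
def joint_score(ctc_logp, att_logp, ctc_weight=0.3):
    """Weighted sum of CTC and attention log-probabilities (illustrative weight)."""
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


def pick_best(hypotheses, ctc_weight=0.3):
    """Return the text of the hypothesis with the highest joint score.

    Each hypothesis is a (text, ctc_logp, att_logp) tuple.
    """
    best = max(hypotheses, key=lambda h: joint_score(h[1], h[2], ctc_weight))
    return best[0]


# Invented candidates: the second is slightly preferred by the attention
# decoder alone, but CTC assigns it a much lower score.
hypotheses = [
    ("bonjour le monde", -4.0, -2.0),
    ("bonjour le mode", -9.0, -1.5),
]

print(pick_best(hypotheses))                  # joint scoring
print(pick_best(hypotheses, ctc_weight=0.0))  # attention score only
```

With the joint score the first candidate wins; with the CTC weight set to zero the attention score alone decides and the second candidate is chosen instead.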

Wav2vec2 Large Xlsr Thai Demo

A speech recognition model based on facebook/wav2vec2-large-xlsr-53, fine-tuned on the Thai Common Voice dataset.

Speech Recognition

Transformers Other

Asr Wav2vec2 Commonvoice Fr

A wav2vec 2.0 speech recognition model trained on the CommonVoice French dataset with a CTC/attention architecture; it does not require a language model.

Speech Recognition French

Wav2vec2 Base Turkish Cv7

A Turkish automatic speech recognition model based on the wav2vec2 architecture, fine-tuned on the Common Voice 7.0 Turkish dataset.

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Lithuanian

An automatic speech recognition model fine-tuned for Lithuanian using the Common Voice dataset, based on the facebook/wav2vec2-large-xlsr-53 model.

Speech Recognition Other

Wav2vec2 Xls R 300m Bas CV8 V2

An automatic speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the Common Voice 8 dataset for Basaa (bas).

Speech Recognition

Transformers Other

Wav2vec2 Random

An automatic speech recognition model based on wav2vec2-base-random, fine-tuned on the TIMIT_ASR dataset.

Speech Recognition

patrickvonplaten

Wav2vec2 Large Xls R 300m Hindi Colab

A Hindi speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the Common Voice dataset.

Speech Recognition

Wav2vec2 2 Bert Large No Adapter

An automatic speech recognition (ASR) model trained on the LibriSpeech dataset for converting English speech to text.

Speech Recognition

Wav2vec2 Large Xlsr Mongolian

This is an automatic speech recognition model based on facebook/wav2vec2-large-xlsr-53, fine-tuned on the Mongolian Common Voice dataset.

Speech Recognition Other

Wav2vec2 Base Vietnamese 250h

A Vietnamese automatic speech recognition model based on the wav2vec 2.0 architecture, pre-trained on 13,000 hours of unlabeled audio and fine-tuned on 250 hours of labeled data.

Speech Recognition

Transformers Other
